Exploration server for computer science research in Lorraine

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Batch, Off-policy and Model-free Apprenticeship Learning

Internal identifier: 002141 (Main/Exploration); previous: 002140; next: 002142

Authors: Edouard Klein [France]; Matthieu Geist [France]; Olivier Pietquin [France]

RBID : Hal:hal-00660623

Abstract

This paper addresses the problem of apprenticeship learning, that is, learning control policies from demonstrations by an expert. An efficient framework for it is inverse reinforcement learning (IRL). Based on the assumption that the expert maximizes a utility function, IRL aims at learning the underlying reward from example trajectories. Many IRL algorithms assume that the reward function is linearly parameterized and rely on the computation of associated feature expectations, which is done through Monte Carlo simulation. However, this requires full trajectories for the expert policy as well as at least a generative model for intermediate policies. In this paper, we introduce a temporal difference method, namely LSTD-mu, to compute these feature expectations. This allows extending apprenticeship learning to a batch and off-policy setting.
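
The idea of computing feature expectations with a temporal difference method instead of Monte Carlo rollouts can be illustrated with a minimal sketch. This record does not give the LSTD-mu algorithm itself, so the code below is only an assumption: it applies the standard least-squares temporal difference (LSTD) solution to a vector-valued "reward" equal to the reward features, using sampled transitions rather than full trajectories. The function name, the basis features `psi`, the reward features `phi`, and the discount factor `gamma` are all hypothetical choices made for illustration, and the sketch covers only the on-policy, state-feature case, not the off-policy setting mentioned in the abstract.

```python
import numpy as np


def lstd_feature_expectations(psi, psi_next, phi, gamma=0.9, ridge=0.0):
    """Fit Xi so that the feature expectation mu(s) is approximated by Xi.T @ psi(s).

    psi:      (n, p) basis features of the sampled states s_i
    psi_next: (n, p) basis features of the successor states s'_i
    phi:      (n, k) reward features of s_i, used here in place of a scalar reward
    ridge:    optional regularisation weight (an assumption, not from the abstract)
    """
    # Standard LSTD normal equations, with one right-hand side per reward feature.
    A = psi.T @ (psi - gamma * psi_next)      # shape (p, p)
    b = psi.T @ phi                           # shape (p, k)
    A = A + ridge * np.eye(A.shape[0])
    return np.linalg.solve(A, b)              # shape (p, k)


# Toy chain (hypothetical): state 0 moves to state 1, state 1 loops on itself.
psi = np.eye(2)                               # one-hot basis features
psi_next = np.array([[0.0, 1.0], [0.0, 1.0]])
phi = np.eye(2)                               # one-hot reward features
xi = lstd_feature_expectations(psi, psi_next, phi, gamma=0.5)
mu = psi @ xi                                 # row i is the estimate of mu(s_i)
print(mu)                                     # rows: [1, 1] for state 0, [0, 2] for state 1
```

In this toy chain the result matches the discounted sums computed by hand: state 1 accumulates its own feature as 1 / (1 - 0.5) = 2, and state 0 adds its own feature to half of that. Only single transitions are used, which is what makes a batch setting plausible. Handling data from a different policy would require further changes that this sketch does not attempt.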

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Batch, Off-policy and Model-free Apprenticeship Learning</title>
<author>
<name sortKey="Klein, Edouard" sort="Klein, Edouard" uniqKey="Klein E" first="Edouard" last="Klein">Edouard Klein</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-24433" status="OLD">
<orgName>Machine Learning and Computational Biology</orgName>
<orgName type="acronym">ABC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche/equipes/abc</ref>
</desc>
<listRelation>
<relation active="#struct-160" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-300291" type="indirect"></relation>
<relation active="#struct-300292" type="indirect"></relation>
<relation active="#struct-300293" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-160" type="direct">
<org type="laboratory" xml:id="struct-160" status="OLD">
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<desc>
<address>
<addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
<relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-300291" type="direct"></relation>
<relation active="#struct-300292" type="direct"></relation>
<relation active="#struct-300293" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300291" type="indirect">
<org type="institution" xml:id="struct-300291" status="OLD">
<orgName>Université Henri Poincaré - Nancy 1</orgName>
<orgName type="acronym">UHP</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<addrLine>24-30 rue Lionnois, BP 60120, 54 003 NANCY cedex, France</addrLine>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300292" type="indirect">
<org type="institution" xml:id="struct-300292" status="OLD">
<orgName>Université Nancy 2</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<addrLine>91 avenue de la Libération, BP 454, 54001 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300293" type="indirect">
<org type="institution" xml:id="struct-300293" status="OLD">
<orgName>Institut National Polytechnique de Lorraine</orgName>
<orgName type="acronym">INPL</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Nancy 2</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
<placeName>
<settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Institut national polytechnique de Lorraine</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Geist, Matthieu" sort="Geist, Matthieu" uniqKey="Geist M" first="Matthieu" last="Geist">Matthieu Geist</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-394500" status="INCOMING">
<orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-24541" type="direct"></relation>
<relation active="#struct-242365" type="indirect"></relation>
<relation active="#struct-411575" type="indirect"></relation>
<relation active="#struct-301991" type="indirect"></relation>
<relation active="#struct-301990" type="indirect"></relation>
<relation active="#struct-300812" type="indirect"></relation>
<relation active="#struct-300413" type="indirect"></relation>
<relation active="#struct-300289" type="indirect"></relation>
<relation name="UMI2958" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-26305" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-24541" type="direct">
<org type="laboratory" xml:id="struct-24541" status="VALID">
<idno type="RNSR">200619366D</idno>
<orgName>Georgia Tech - CNRS [Metz]</orgName>
<orgName type="acronym">UMI2958</orgName>
<desc>
<address>
<addrLine>Metz Technopôle 2-3 rue Marconi 57070 METZ</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.umi2958.eu</ref>
</desc>
<listRelation>
<relation active="#struct-242365" type="direct"></relation>
<relation active="#struct-411575" type="direct"></relation>
<relation active="#struct-301991" type="direct"></relation>
<relation active="#struct-301990" type="direct"></relation>
<relation active="#struct-300812" type="direct"></relation>
<relation active="#struct-300413" type="direct"></relation>
<relation active="#struct-300289" type="direct"></relation>
<relation name="UMI2958" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-242365" type="indirect">
<org type="institution" xml:id="struct-242365" status="VALID">
<idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-411575" type="indirect">
<org type="institution" xml:id="struct-411575" status="VALID">
<orgName>CentraleSupélec</orgName>
<desc>
<address>
<addrLine>3, rue Joliot Curie, Plateau de Moulon, 91192 GIF-SUR-YVETTE Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.centralesupelec.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301991" type="indirect">
<org type="institution" xml:id="struct-301991" status="VALID">
<orgName>Georgia Tech Lorraine</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301990" type="indirect">
<org type="institution" xml:id="struct-301990" status="VALID">
<orgName>Georgia Institute of Technology [Atlanta]</orgName>
<desc>
<address>
<addrLine>North Avenue, Atlanta, GA 30332</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.gatech.edu/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect">
<org type="institution" xml:id="struct-300812" status="VALID">
<orgName>SUPELEC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300413" type="indirect">
<org type="institution" xml:id="struct-300413" status="VALID">
<orgName>Ecole Nationale Supérieure des Arts et Metiers Metz</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300289" type="indirect">
<org type="institution" xml:id="struct-300289" status="OLD">
<orgName>Université Paul Verlaine - Metz</orgName>
<orgName type="acronym">UPVM</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMI2958" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-26305" type="direct">
<org type="laboratory" xml:id="struct-26305" status="VALID">
<orgName>SUPELEC-Campus Metz</orgName>
<desc>
<address>
<addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation>
<relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
<placeName>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Paul Verlaine - Metz</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Pietquin, Olivier" sort="Pietquin, Olivier" uniqKey="Pietquin O" first="Olivier" last="Pietquin">Olivier Pietquin</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-394500" status="INCOMING">
<orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-24541" type="direct"></relation>
<relation active="#struct-242365" type="indirect"></relation>
<relation active="#struct-411575" type="indirect"></relation>
<relation active="#struct-301991" type="indirect"></relation>
<relation active="#struct-301990" type="indirect"></relation>
<relation active="#struct-300812" type="indirect"></relation>
<relation active="#struct-300413" type="indirect"></relation>
<relation active="#struct-300289" type="indirect"></relation>
<relation name="UMI2958" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-26305" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-24541" type="direct">
<org type="laboratory" xml:id="struct-24541" status="VALID">
<idno type="RNSR">200619366D</idno>
<orgName>Georgia Tech - CNRS [Metz]</orgName>
<orgName type="acronym">UMI2958</orgName>
<desc>
<address>
<addrLine>Metz Technopôle 2-3 rue Marconi 57070 METZ</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.umi2958.eu</ref>
</desc>
<listRelation>
<relation active="#struct-242365" type="direct"></relation>
<relation active="#struct-411575" type="direct"></relation>
<relation active="#struct-301991" type="direct"></relation>
<relation active="#struct-301990" type="direct"></relation>
<relation active="#struct-300812" type="direct"></relation>
<relation active="#struct-300413" type="direct"></relation>
<relation active="#struct-300289" type="direct"></relation>
<relation name="UMI2958" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-242365" type="indirect">
<org type="institution" xml:id="struct-242365" status="VALID">
<idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-411575" type="indirect">
<org type="institution" xml:id="struct-411575" status="VALID">
<orgName>CentraleSupélec</orgName>
<desc>
<address>
<addrLine>3, rue Joliot Curie, Plateau de Moulon, 91192 GIF-SUR-YVETTE Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.centralesupelec.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301991" type="indirect">
<org type="institution" xml:id="struct-301991" status="VALID">
<orgName>Georgia Tech Lorraine</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301990" type="indirect">
<org type="institution" xml:id="struct-301990" status="VALID">
<orgName>Georgia Institute of Technology [Atlanta]</orgName>
<desc>
<address>
<addrLine>North Avenue, Atlanta, GA 30332</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.gatech.edu/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect">
<org type="institution" xml:id="struct-300812" status="VALID">
<orgName>SUPELEC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300413" type="indirect">
<org type="institution" xml:id="struct-300413" status="VALID">
<orgName>Ecole Nationale Supérieure des Arts et Metiers Metz</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300289" type="indirect">
<org type="institution" xml:id="struct-300289" status="OLD">
<orgName>Université Paul Verlaine - Metz</orgName>
<orgName type="acronym">UPVM</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMI2958" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-26305" type="direct">
<org type="laboratory" xml:id="struct-26305" status="VALID">
<orgName>SUPELEC-Campus Metz</orgName>
<desc>
<address>
<addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation>
<relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
<placeName>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Paul Verlaine - Metz</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">HAL</idno>
<idno type="RBID">Hal:hal-00660623</idno>
<idno type="halId">hal-00660623</idno>
<idno type="halUri">https://hal-supelec.archives-ouvertes.fr/hal-00660623</idno>
<idno type="url">https://hal-supelec.archives-ouvertes.fr/hal-00660623</idno>
<date when="2011-09-09">2011-09-09</date>
<idno type="wicri:Area/Hal/Corpus">001200</idno>
<idno type="wicri:Area/Hal/Curation">001200</idno>
<idno type="wicri:Area/Hal/Checkpoint">001B98</idno>
<idno type="wicri:explorRef" wicri:stream="Hal" wicri:step="Checkpoint">001B98</idno>
<idno type="wicri:Area/Main/Merge">002185</idno>
<idno type="wicri:Area/Main/Curation">002141</idno>
<idno type="wicri:Area/Main/Exploration">002141</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Batch, Off-policy and Model-free Apprenticeship Learning</title>
<author>
<name sortKey="Klein, Edouard" sort="Klein, Edouard" uniqKey="Klein E" first="Edouard" last="Klein">Edouard Klein</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-24433" status="OLD">
<orgName>Machine Learning and Computational Biology</orgName>
<orgName type="acronym">ABC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr/la-recherche/equipes/abc</ref>
</desc>
<listRelation>
<relation active="#struct-160" type="direct"></relation>
<relation name="UMR7503" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-300009" type="indirect"></relation>
<relation active="#struct-300291" type="indirect"></relation>
<relation active="#struct-300292" type="indirect"></relation>
<relation active="#struct-300293" type="indirect"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-160" type="direct">
<org type="laboratory" xml:id="struct-160" status="OLD">
<orgName>Laboratoire Lorrain de Recherche en Informatique et ses Applications</orgName>
<orgName type="acronym">LORIA</orgName>
<desc>
<address>
<addrLine>Campus Scientifique BP 239 54506 Vandoeuvre-lès-Nancy Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.loria.fr</ref>
</desc>
<listRelation>
<relation name="UMR7503" active="#struct-441569" type="direct"></relation>
<relation active="#struct-300009" type="direct"></relation>
<relation active="#struct-300291" type="direct"></relation>
<relation active="#struct-300292" type="direct"></relation>
<relation active="#struct-300293" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle name="UMR7503" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300009" type="indirect">
<org type="institution" xml:id="struct-300009" status="VALID">
<orgName>Institut National de Recherche en Informatique et en Automatique</orgName>
<orgName type="acronym">Inria</orgName>
<desc>
<address>
<addrLine>Domaine de Voluceau, Rocquencourt - BP 105, 78153 Le Chesnay Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.inria.fr/en/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300291" type="indirect">
<org type="institution" xml:id="struct-300291" status="OLD">
<orgName>Université Henri Poincaré - Nancy 1</orgName>
<orgName type="acronym">UHP</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<addrLine>24-30 rue Lionnois, BP 60120, 54 003 NANCY cedex, France</addrLine>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300292" type="indirect">
<org type="institution" xml:id="struct-300292" status="OLD">
<orgName>Université Nancy 2</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<addrLine>91 avenue de la Libération, BP 454, 54001 Nancy cedex</addrLine>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300293" type="indirect">
<org type="institution" xml:id="struct-300293" status="OLD">
<orgName>Institut National Polytechnique de Lorraine</orgName>
<orgName type="acronym">INPL</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Nancy 2</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
<placeName>
<settlement type="city">Nancy</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Institut national polytechnique de Lorraine</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Geist, Matthieu" sort="Geist, Matthieu" uniqKey="Geist M" first="Matthieu" last="Geist">Matthieu Geist</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-394500" status="INCOMING">
<orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-24541" type="direct"></relation>
<relation active="#struct-242365" type="indirect"></relation>
<relation active="#struct-411575" type="indirect"></relation>
<relation active="#struct-301991" type="indirect"></relation>
<relation active="#struct-301990" type="indirect"></relation>
<relation active="#struct-300812" type="indirect"></relation>
<relation active="#struct-300413" type="indirect"></relation>
<relation active="#struct-300289" type="indirect"></relation>
<relation name="UMI2958" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-26305" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-24541" type="direct">
<org type="laboratory" xml:id="struct-24541" status="VALID">
<idno type="RNSR">200619366D</idno>
<orgName>Georgia Tech - CNRS [Metz]</orgName>
<orgName type="acronym">UMI2958</orgName>
<desc>
<address>
<addrLine>Metz Technopôle 2-3 rue Marconi 57070 METZ</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.umi2958.eu</ref>
</desc>
<listRelation>
<relation active="#struct-242365" type="direct"></relation>
<relation active="#struct-411575" type="direct"></relation>
<relation active="#struct-301991" type="direct"></relation>
<relation active="#struct-301990" type="direct"></relation>
<relation active="#struct-300812" type="direct"></relation>
<relation active="#struct-300413" type="direct"></relation>
<relation active="#struct-300289" type="direct"></relation>
<relation name="UMI2958" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-242365" type="indirect">
<org type="institution" xml:id="struct-242365" status="VALID">
<idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-411575" type="indirect">
<org type="institution" xml:id="struct-411575" status="VALID">
<orgName>CentraleSupélec</orgName>
<desc>
<address>
<addrLine>3, rue Joliot Curie, Plateau de Moulon, 91192 GIF-SUR-YVETTE Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.centralesupelec.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301991" type="indirect">
<org type="institution" xml:id="struct-301991" status="VALID">
<orgName>Georgia Tech Lorraine</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301990" type="indirect">
<org type="institution" xml:id="struct-301990" status="VALID">
<orgName>Georgia Institute of Technology [Atlanta]</orgName>
<desc>
<address>
<addrLine>North Avenue, Atlanta, GA 30332</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.gatech.edu/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect">
<org type="institution" xml:id="struct-300812" status="VALID">
<orgName>SUPELEC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300413" type="indirect">
<org type="institution" xml:id="struct-300413" status="VALID">
<orgName>Ecole Nationale Supérieure des Arts et Metiers Metz</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300289" type="indirect">
<org type="institution" xml:id="struct-300289" status="OLD">
<orgName>Université Paul Verlaine - Metz</orgName>
<orgName type="acronym">UPVM</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMI2958" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-26305" type="direct">
<org type="laboratory" xml:id="struct-26305" status="VALID">
<orgName>SUPELEC-Campus Metz</orgName>
<desc>
<address>
<addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation>
<relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
<placeName>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Paul Verlaine - Metz</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
<author>
<name sortKey="Pietquin, Olivier" sort="Pietquin, Olivier" uniqKey="Pietquin O" first="Olivier" last="Pietquin">Olivier Pietquin</name>
<affiliation wicri:level="1">
<hal:affiliation type="researchteam" xml:id="struct-394500" status="INCOMING">
<orgName>IMS - Equipe Information, Multimodalité et Signal</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
<listRelation>
<relation active="#struct-24541" type="direct"></relation>
<relation active="#struct-242365" type="indirect"></relation>
<relation active="#struct-411575" type="indirect"></relation>
<relation active="#struct-301991" type="indirect"></relation>
<relation active="#struct-301990" type="indirect"></relation>
<relation active="#struct-300812" type="indirect"></relation>
<relation active="#struct-300413" type="indirect"></relation>
<relation active="#struct-300289" type="indirect"></relation>
<relation name="UMI2958" active="#struct-441569" type="indirect"></relation>
<relation active="#struct-26305" type="direct"></relation>
</listRelation>
<tutelles>
<tutelle active="#struct-24541" type="direct">
<org type="laboratory" xml:id="struct-24541" status="VALID">
<idno type="RNSR">200619366D</idno>
<orgName>Georgia Tech - CNRS [Metz]</orgName>
<orgName type="acronym">UMI2958</orgName>
<desc>
<address>
<addrLine>Metz Technopôle 2-3 rue Marconi 57070 METZ</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.umi2958.eu</ref>
</desc>
<listRelation>
<relation active="#struct-242365" type="direct"></relation>
<relation active="#struct-411575" type="direct"></relation>
<relation active="#struct-301991" type="direct"></relation>
<relation active="#struct-301990" type="direct"></relation>
<relation active="#struct-300812" type="direct"></relation>
<relation active="#struct-300413" type="direct"></relation>
<relation active="#struct-300289" type="direct"></relation>
<relation name="UMI2958" active="#struct-441569" type="direct"></relation>
</listRelation>
</org>
</tutelle>
<tutelle active="#struct-242365" type="indirect">
<org type="institution" xml:id="struct-242365" status="VALID">
<idno type="IdRef">026403188</idno>
<idno type="ISNI">0000 0001 2188 3779 </idno>
<orgName>Université de Franche-Comté</orgName>
<orgName type="acronym">UFC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.univ-fcomte.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-411575" type="indirect">
<org type="institution" xml:id="struct-411575" status="VALID">
<orgName>CentraleSupélec</orgName>
<desc>
<address>
<addrLine>3, rue Joliot Curie, Plateau de Moulon, 91192 GIF-SUR-YVETTE Cedex</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.centralesupelec.fr</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301991" type="indirect">
<org type="institution" xml:id="struct-301991" status="VALID">
<orgName>Georgia Tech Lorraine</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-301990" type="indirect">
<org type="institution" xml:id="struct-301990" status="VALID">
<orgName>Georgia Institute of Technology [Atlanta]</orgName>
<desc>
<address>
<addrLine>North Avenue, Atlanta, GA 30332</addrLine>
<country key="US"></country>
</address>
<ref type="url">http://www.gatech.edu/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300812" type="indirect">
<org type="institution" xml:id="struct-300812" status="VALID">
<orgName>SUPELEC</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300413" type="indirect">
<org type="institution" xml:id="struct-300413" status="VALID">
<orgName>Ecole Nationale Supérieure des Arts et Metiers Metz</orgName>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle active="#struct-300289" type="indirect">
<org type="institution" xml:id="struct-300289" status="OLD">
<orgName>Université Paul Verlaine - Metz</orgName>
<orgName type="acronym">UPVM</orgName>
<date type="end">2011-12-31</date>
<desc>
<address>
<country key="FR"></country>
</address>
</desc>
</org>
</tutelle>
<tutelle name="UMI2958" active="#struct-441569" type="indirect">
<org type="institution" xml:id="struct-441569" status="VALID">
<idno type="IdRef">02636817X</idno>
<idno type="ISNI">0000000122597504</idno>
<orgName>Centre National de la Recherche Scientifique</orgName>
<orgName type="acronym">CNRS</orgName>
<date type="start">1939-10-19</date>
<desc>
<address>
<country key="FR"></country>
</address>
<ref type="url">http://www.cnrs.fr/</ref>
</desc>
</org>
</tutelle>
<tutelle active="#struct-26305" type="direct">
<org type="laboratory" xml:id="struct-26305" status="VALID">
<orgName>SUPELEC-Campus Metz</orgName>
<desc>
<address>
<addrLine>2 rue Edouard Belin 57070 Metz</addrLine>
<country key="FR"></country>
</address>
<ref type="url">http://www.metz.supelec.fr/metz/</ref>
</desc>
<listRelation>
<relation active="#struct-300812" type="direct"></relation>
</listRelation>
</org>
</tutelle>
</tutelles>
</hal:affiliation>
<country>France</country>
<placeName>
<settlement type="city" wicri:auto="siege">Besançon</settlement>
<region type="region" nuts="2">Franche-Comté</region>
</placeName>
<orgName type="university">Université de Franche-Comté</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Bourgogne Franche-Comté</orgName>
<placeName>
<settlement type="city">Metz</settlement>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
</placeName>
<orgName type="university">Université Paul Verlaine - Metz</orgName>
<orgName type="institution" wicri:auto="newGroup">Université de Lorraine</orgName>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">This paper addresses the problem of apprenticeship learning, that is, learning control policies from demonstrations by an expert. An efficient framework for it is inverse reinforcement learning (IRL). Based on the assumption that the expert maximizes a utility function, IRL aims at learning the underlying reward from example trajectories. Many IRL algorithms assume that the reward function is linearly parameterized and rely on the computation of some associated feature expectations, which is done through Monte Carlo simulation. However, this requires full trajectories for the expert policy as well as at least a generative model for intermediate policies. In this paper, we introduce a temporal difference method, namely LSTD-mu, to compute these feature expectations. This allows extending apprenticeship learning to a batch and off-policy setting.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Franche-Comté</li>
<li>Grand Est</li>
<li>Lorraine (région)</li>
</region>
<settlement>
<li>Besançon</li>
<li>Metz</li>
<li>Nancy</li>
</settlement>
<orgName>
<li>Institut national polytechnique de Lorraine</li>
<li>Université Nancy 2</li>
<li>Université Paul Verlaine - Metz</li>
<li>Université de Bourgogne Franche-Comté</li>
<li>Université de Franche-Comté</li>
<li>Université de Lorraine</li>
</orgName>
</list>
<tree>
<country name="France">
<region name="Grand Est">
<name sortKey="Klein, Edouard" sort="Klein, Edouard" uniqKey="Klein E" first="Edouard" last="Klein">Edouard Klein</name>
</region>
<name sortKey="Geist, Matthieu" sort="Geist, Matthieu" uniqKey="Geist M" first="Matthieu" last="Geist">Matthieu Geist</name>
<name sortKey="Pietquin, Olivier" sort="Pietquin, Olivier" uniqKey="Pietquin O" first="Olivier" last="Pietquin">Olivier Pietquin</name>
</country>
</tree>
</affiliations>
</record>
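
The abstract in the record above describes estimating feature expectations with a temporal difference method instead of Monte Carlo rollouts. As a rough illustration only, and not the authors' implementation, the sketch below applies a generic LSTD-style least-squares solve to feature expectations. It assumes on-policy transitions (s, s_next) and a linear approximation mu_hat(s) = Xi^T psi(s); the off-policy variant mentioned in the abstract is not covered. All function and variable names (`solve`, `lstd_feature_expectations`, `mu_hat`, `one_hot`, `psi`, `phi`, `xi`) are hypothetical.

```python
# Sketch of an LSTD-style estimate of feature expectations (hypothetical names).
# Assumes transitions (s, s_next) were sampled while following the policy of interest.


def solve(a, b):
    """Solve a X = b by Gauss-Jordan elimination with partial pivoting.

    a is an n x n matrix and b an n x k matrix, both given as lists of lists.
    """
    n = len(a)
    # Augmented matrix [a | b] so that row operations apply to both sides.
    m = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: not enough data or features")
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0.0:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n:] for row in m]


def lstd_feature_expectations(transitions, psi, phi, gamma):
    """Return Xi (p x k) such that mu_hat(s) = Xi^T psi(s).

    transitions: iterable of (s, s_next) pairs
    psi: maps a state to a length-p list (basis used for the approximation)
    phi: maps a state to a length-k list (reward features)
    gamma: discount factor in [0, 1)
    """
    transitions = list(transitions)
    p = len(psi(transitions[0][0]))
    k = len(phi(transitions[0][0]))
    a = [[0.0] * p for _ in range(p)]
    b = [[0.0] * k for _ in range(p)]
    for s, s_next in transitions:
        u, v, f = psi(s), psi(s_next), phi(s)
        for i in range(p):
            for j in range(p):
                # Accumulate psi(s) (psi(s) - gamma * psi(s_next))^T
                a[i][j] += u[i] * (u[j] - gamma * v[j])
            for j in range(k):
                # Accumulate psi(s) phi(s)^T
                b[i][j] += u[i] * f[j]
    return solve(a, b)


def mu_hat(xi, psi, s):
    """Evaluate the estimated feature expectation at state s."""
    u = psi(s)
    return [sum(u[i] * xi[i][j] for i in range(len(u))) for j in range(len(xi[0]))]


def one_hot(s):
    """Tabular features for a two-state toy problem."""
    return [1.0 if i == s else 0.0 for i in range(2)]


# Toy chain: state 0 moves to state 1, and state 1 loops on itself.
xi = lstd_feature_expectations([(0, 1), (1, 1)], one_hot, one_hot, gamma=0.9)
```

On this toy chain with tabular features, the estimate should match the discounted sums computed by hand: state 1 accumulates its own feature forever (1 / (1 - 0.9) = 10), and state 0 collects its own feature once and then 0.9 times the value at state 1.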

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 002141 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 002141 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Hal:hal-00660623
   |texte=   Batch, Off-policy and Model-free Apprenticeship Learning
}}

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022